Whose Thumb Is It Anyway? Classifying Author Personality from Weblog Text
Abstract
We report initial results on the relatively novel task of automatic classification of author personality. Using a corpus of personal weblogs, or ‘blogs’, we investigate the accuracy that can be achieved when classifying authors on four important personality traits. We explore both binary and multiple classification, using differing sets of n-gram features. Results are promising for all four traits examined.
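As a rough illustration of what "n-gram features" can mean here, the sketch below counts word n-grams in a piece of text. It is an assumption-laden example, not the authors' pipeline: the function names `word_ngrams` and `ngram_features`, the lowercasing, the whitespace tokenization, and the choice of unigrams plus bigrams are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams (as tuples) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    # A text with fewer than n tokens yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Count every word n-gram for n = 1..max_n (max_n=2 is an illustrative default)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


# Toy input, not data from the corpus described in the abstract.
features = ngram_features("I went out and I went home")
```

Counts like these could then be passed to any standard classifier, with binary or multi-way class labels derived from an author's trait scores; how the paper itself does this is not stated in the abstract.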
Similar resources
Stylistic text classification using functional lexical features
Most text analysis and retrieval work to date has focused on determining the topic of a text, what it is about. However, a text also contains much useful information in its style, or how it is written. This includes information about its author, its purpose, feelings it is meant to evoke, and more. This paper addresses the problem of classifying texts by style (along several different dimension...
A Comprehensive Survey on Personality of Authors on Blog Data
A weblog or blog is defined as a “frequently updated website consisting of dated entries arranged in reverse chronological order so that the most recent post appears first”. Blogs are usually maintained by an individual with regular entries of commentary, descriptions of events, or other material such as graphics or video. In this paper we have surveyed and analyzed the blog author’s personality...
Author gender identification from text using Bayesian Random Forest
The growing use of virtual environments, and the connections users make through social networks such as Facebook, Instagram, and Twitter, make it more necessary than ever to identify shared subjects in this environment. Several applications benefit from reliable methods for inferring the age and gender of social media users. Such applications exist across a wide area of fields,...
Aesthetics of the Creation Sermon in the Light of Formalist Criticism
The present study investigates the first sermon of NahjulBalāgheh in the light of formalist criticism. Taking the religious meaning and content of the creation sermon as its foundation, it asks how far form and integration have been taken into account and to what extent the author has attended to the literary text. Ignoring the non-tex...
Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents’ contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...
Publication date: 2006